Discriminative speaker adaptation with eigenvoices
Abstract
Eigenvoice adaptation is an effective speaker adaptation approach that balances recognition performance against the amount of adaptation data required. However, the conventional Maximum Likelihood Eigen-Decomposition (MLED) method used in eigenvoice adaptation is based on the Maximum Likelihood (ML) criterion and suffers from the unrealistic assumptions that HMMs make about the speech process, so alternative estimation schemes may improve performance. In this paper, we propose a new discriminative adaptation algorithm, Maximum Mutual Information Eigen-Decomposition (MMIED), which maximizes the mutual information between the training word sequences and the observation sequences. Word lattices are used to take competing word hypotheses into account, which makes the estimation more discriminative. MLED, MMIED, and Maximum a Posteriori Eigen-Decomposition (MAPED), which is based on the Maximum a Posteriori (MAP) criterion, were all evaluated experimentally to give a comprehensive comparison. The results showed that MMIED outperformed both MLED and MAPED.
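As a rough illustration of the contrast the abstract draws, the two criteria can be written in their standard textbook forms. The notation is an assumption made here and is not taken from the paper: e_0 is the mean supervector, e_1, ..., e_K are the eigenvoices, w_k are the speaker weights to be estimated, O_r is the r-th adaptation utterance with reference word sequence s_r, M_s is the composite HMM for word sequence s, P(s) is the language model probability, and L_r is the set of hypotheses in the word lattice for utterance r.

```latex
% Eigenvoice constraint: the adapted mean supervector is restricted to
% the space spanned by K eigenvoices (notation assumed, not from the paper).
\hat{\mu}(w) = e_0 + \sum_{k=1}^{K} w_k \, e_k

% ML criterion (as in MLED): choose the weights that maximize the
% likelihood of the adaptation data given the reference transcriptions.
\hat{w}_{\mathrm{ML}} = \arg\max_{w} \sum_{r=1}^{R}
  \log p\bigl(O_r \mid \mathcal{M}_{s_r};\, \hat{\mu}(w)\bigr)

% MMI criterion (as in MMIED): the same numerator, normalized by the
% competing hypotheses in the word lattice \mathcal{L}_r.
\hat{w}_{\mathrm{MMI}} = \arg\max_{w} \sum_{r=1}^{R}
  \log \frac{p\bigl(O_r \mid \mathcal{M}_{s_r};\, \hat{\mu}(w)\bigr)\, P(s_r)}
            {\sum_{s \in \mathcal{L}_r} p\bigl(O_r \mid \mathcal{M}_{s};\, \hat{\mu}(w)\bigr)\, P(s)}
```

Under these assumptions, the only difference between the two objectives is the denominator: the MMI form rewards weights that raise the score of the reference transcription relative to its lattice competitors, rather than its likelihood alone.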
Similar resources
Rapid speaker adaptation for continuous speech recognition using merging eigenvoices
Speaker adaptation in eigenvoice space is a popular method for rapid speaker adaptation. To improve the performance of the method and to obtain stabilized results, the number of speaker-dependent models should be increased and a greater number of eigenvoices should be re-estimated. However, the huge computation time required to find eigenvoices makes these solutions difficult, especially in a c...
Using genetic algorithms for rapid speaker adaptation
This paper proposes two new approaches to rapid speaker adaptation of acoustic models by using genetic algorithms. Whereas conventional speaker adaptation techniques yield adapted models which represent local optimum solutions, genetic algorithms are capable of providing multiple optimal solutions, thereby delivering potentially more robust adapted models. We have investigated two different strat...
Eigenvoices for speaker adaptation
We have devised a new class of fast adaptation techniques for speech recognition, based on prior knowledge of speaker variation. To obtain this prior knowledge, one applies Principal Component Analysis (PCA) [9] or a similar technique to a training set of T vectors of dimension D derived from T speaker-dependent (SD) models. This offline step yields T basis vectors, which we call “eigenvoices” ...
Speaker Adaptation for Continuous Density HMMs: A Review
This paper reviews some popular speaker adaptation schemes that can be applied to continuous density hidden Markov models. These fall into three families: those based on MAP adaptation; linear transforms of model parameters, such as maximum likelihood linear regression; and speaker clustering/speaker space methods, such as eigenvoices. The strengths and weaknesses of each adaptation family are discussed...
Fast speaker adaptation using a priori knowledge
Recently, we presented a radically new class of fast adaptation techniques for speech recognition, based on prior knowledge of speaker variation. To obtain this prior knowledge, one applies a dimensionality reduction technique to T vectors of dimension D derived from T speaker-dependent (SD) models. This offline step yields T basis vectors, the eigenvoices. We constrain the model for new speake...
Publication date: 2005